Mathematical and Computational Linguistics
MAT1509HS: Mathematical and Computational Linguistics
Winter 2019: University of Toronto, BA6180, Tuesday 4-6 pm and Wednesday 4-5 pm
Instructor: Matilde Marcolli
(Alexander Calder, "Four Red Systems", 1960)
Brief Course Description
The class will cover mathematical and computational models of the acquisition and evolution of natural languages. We will discuss learnability questions, Markov chain models, population dynamics models, evolutionary behavior, and communicative efficiency and fitness. We will focus in particular on the Principles and Parameters model of linguistics, and we will discuss the use of mathematical methods from algebraic geometry, topology, and statistical physics to describe the evolution of natural languages. Specific examples from historical linguistics will be revisited from a mathematical and computational perspective.
Slides of Lectures
- What is Linguistics?
- What is Linguistics? Part II
- What is Linguistics? Part III
- Language Change or Evolution?
- Phylogenetic Algebraic Geometry and Syntax
- Spin Glass Model of Syntactic Parameters
- Towards a Geometry of Syntax
- Persistent Topology of Syntax
- Formal Languages: Part I
- Formal Languages: Part II
- Parse trees: from formal to natural languages
- Probabilistic Linguistics
- Graph Grammars
- Minimalism and Merge Grammars
- Language and Complexity
- Natural Language Processing
- Semantic Spaces
- Models of Language Acquisition
- Models of Language Acquisition: Part II
- Language Acquisition: Parameter Setting
- Models of Language Evolution
- Models of Language Evolution: Part II
- Models of Language Evolution: Part III
- Symbolic and Statistical Approaches to Language
- Topological Analysis of Syntactic Structures
Reading Materials
Links to articles and reading suggestions for presentations:
- Noam Chomsky, "Three models for the description of language"
- C.E. Shannon, "Prediction and Entropy of Printed English"
- D. Link, "Traces of the Mouth: Andrei Andreyevich Markov's mathematization of writing"
- R.C. Berwick, "Mind the Gap"
- Partha Niyogi, Robert C. Berwick, "A dynamical systems model for language change"
- L. Pachter, B. Sturmfels, "The Mathematics of Phylogenomics"
- G. Longobardi, C. Guardiano, G. Silvestri, A. Boattini, A. Ceolin, "Towards a syntactic phylogeny of modern Indo-European languages"
- G. Longobardi, C. Guardiano, "Evidence for syntax as a signal of historical relatedness"
- K. Ehret, B. Szmrecsanyi, "An information-theoretic approach to assess linguistic complexity"
- M. Bane, "Quantifying and measuring Morphological Complexity"
- A. Kaltchenko, "Algorithms for estimating information distance with applications to bioinformatics and linguistics"
- M. Belkin, P. Niyogi, "Towards a theoretical foundation for Laplacian-based manifold methods"
- P. Breiding, S. Kalisnik, B. Sturmfels, M. Weinstein, "Learning algebraic varieties from samples"
- A. Auffinger, A. Lerario, E. Lundberg, "Topologies of random geometric complexes on Riemannian manifolds in the thermodynamic limit"
- R. Clark, "Kolmogorov complexity and the information content of parameters"
- A.K. Zvonkin, L.A. Levin, "The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms"
- R. Sproat, M. Yarmohammadi, I. Shafran, B. Roark, "Lexicographic Semirings"
- E.P. Stabler, "Computational perspectives on minimalism"
- T. Hunter, C. Dyer, "Distributions on Minimalist Grammar Derivations"
- P. beim Graben, S. Gerth, "Geometric representations for minimalist grammars"
- S. Giraudo, J.G. Luque, L. Mignot, F. Nicart, "Operads, quasiorders and regular languages"
- T. Ceccherini-Silberstein, W. Woess, "Growth and Ergodicity of Context-free Languages"
- J. Shallit, "Number Theory and Formal Languages"
- Eibe Frank, "Formal Languages and Automata", Chapter 6
- L. Sennhauser, R.C. Berwick, "Evaluating the Ability of LSTMs to Learn Context-Free Grammars"
Schedule of Final Presentations
- Tuesday April 2 (class time)
  - Laurestine Bradford, "Applications of Lexicographic Semirings to Problems in Speech and Language Processing"
  - Gal Gross, "Algebraic languages and polyominoes enumeration"
  - Amjad Mobayed, "Traces of the Mouth: Andrei Andreyevich Markov's Mathematization of Writing"
- Tuesday April 9 (class time)
  - Shuyang Shen, "Number theory and formal languages"
  - Sitanshu Gakkhar, "Computational neurolinguistics"
  - David Ledvinka, "Monad Transformers for Natural Language: Combining Monads to Model Effect Interaction"
  - Feodor Kogan, "Operads and formal grammar"
- Wednesday April 10 (class time)
  - Suleiman Motasem, "Arabic computational linguistics"
  - Jesse Frolich and Andrew Wilson, "toki pun-a: a computational approach to jokes (i.e. syntactic ambiguity)"